Abstract

We design the weights in consensus algorithms with spatially correlated random topologies. These
arise with: 1) networks with spatially correlated random link failures and 2) networks with randomized
averaging protocols. We show that the weight optimization problem is convex for both symmetric and
asymmetric random graphs. With symmetric random networks, we choose the consensus mean squared
error (MSE) convergence rate as the optimization criterion and explicitly express this rate as a function of
the link formation probabilities, the link formation spatial correlations, and the consensus weights. We
prove that the MSE convergence rate is a convex, nonsmooth function of the weights, enabling global
optimization of the weights for arbitrary link formation probabilities and link correlation structures. We
extend our results to the case of asymmetric random links. We adopt as the optimization criterion the mean
squared deviation (MSdev) of the nodes' states from the current average state. We prove that MSdev is
a convex function of the weights. Simulations show that significant performance gain is achieved with
our weight design method when compared with methods available in the literature.

I. Introduction

This paper finds the optimal weights for the consensus algorithm in correlated random networks.
Consensus is an iterative distributed algorithm that computes the global average of data distributed among
a network of agents using only local communications. Consensus has renewed interest in distributed
algorithms ([1], [2]), arising in many different areas from distributed data fusion ([3], [4], [5], [6], [7])
to coordination of mobile autonomous agents ([8], [9]). A recent survey is [10].

This paper studies consensus algorithms in networks where the links (being online or offline) are
random. We consider two scenarios: 1) the network is random, because links in the network may fail at 
random times; 2) the network protocol is randomized, i.e., the link states along time are controlled by 
a randomized protocol (e.g., standard gossip algorithm [11], broadcast gossip algorithm [12]). In both 
cases, we model the links as Bernoulli random variables. Each link has some formation probability, i.e., 
probability of being active, equal to Pij. Different links may be correlated at the same time, which can 
be expected in real applications. For example, in wireless sensor networks (WSNs) links can be spatially 
correlated due to interference among close links or electromagnetic shadows that may affect several 
nearby sensors. 

References on consensus under time varying or random topology are ([13], [10], [14]) and ([15],
[16], [17], [18], [12]), among others, respectively. Most of the previous work is focussed on providing
convergence conditions and/or characterizing the convergence rate under different assumptions on the
network randomness ([17], [16], [18]). For example, references [16] and [19] study the consensus algorithm
with spatially and temporally independent link failures. They show that a necessary and sufficient
condition for mean squared and almost sure convergence is for the communication graph to be connected
on average.

We consider here the weight optimization problem: how to assign the weights Wij with which the nodes 
mix their states across the network, so that the convergence towards consensus is the fastest possible. 
This problem has not been solved (with full generality) for consensus in random topologies. We study
this problem for networks with symmetric and asymmetric random links separately, since the properties 
of the corresponding algorithm are different. For symmetric links (and connected network topology on 
average), the consensus algorithm converges to the average of the initial nodes' states almost surely. For 
asymmetric random links, all the nodes asymptotically reach agreement, but they only agree to a random 
variable in the neighborhood of the true initial average. 

We refer to our weight solution as probability-based weights (PBW). PBW are simple and suitable
for distributed implementation: we assume at each iteration that the weight of link (i, j) is Wij (to be
optimized) when the link is alive, and 0 otherwise. Self-weights are adapted such that the row-sums of the
weight matrix at each iteration are one. Each node updates
readily after receiving messages from its current neighbors. No information about the number of nodes
in the network or the neighbors' current degrees is needed. Hence, no additional online communication is
required for computing weights, in contrast, for instance, to the case of the Metropolis weights (MW) [14].

Our weight design method assumes that the link formation probabilities and their spatial correlations
are known. With randomized protocols, the link formation probabilities and their correlations are induced
by the protocol itself, and thus are known. For networks with random link failures, the link formation
probabilities relate to the signal to noise ratio at the receiver and can be computed. In [20], the formation
probabilities are designed in the presence of link communication costs and an overall network
communication cost budget. When the WSN infrastructure is known, it is possible to estimate the link formation
probabilities by measuring the reception rate of a link computed as the ratio between the number of
received and the total number of sent packets. Another possibility is to estimate the link formation
probabilities based on the received signal strength. Link formation correlations can also be estimated
on actual WSNs [21]. If there is no training period to characterize quantitatively the links on an actual
WSN, we can still model the probabilities and the correlations as a function of the transmitted power
and the inter-sensor distances. Moreover, several empirical studies ([21], [22] and references therein) on
the quantitative properties of wireless communication in sensor networks have been done that provide
models for packet delivery performance in WSNs.
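The reception-rate estimate mentioned above can be sketched in a few lines. This is only an illustration: the function name, the dictionary layout, and the packet counts are hypothetical and are not taken from the paper or from [21], [22].

```python
def estimate_formation_probabilities(sent, received):
    """Estimate link formation probabilities as reception rates.

    sent[(i, j)]     : packets node j sent to node i (hypothetical log format)
    received[(i, j)] : how many of those packets node i actually received
    """
    estimates = {}
    for link, n_sent in sent.items():
        if n_sent == 0:
            continue  # no traffic on this link, so no estimate is possible
        estimates[link] = received.get(link, 0) / n_sent
    return estimates


# Hypothetical packet counts for the two directions of one physical link.
sent = {(1, 2): 200, (2, 1): 200}
received = {(1, 2): 150, (2, 1): 120}
p_hat = estimate_formation_probabilities(sent, received)
```

The two directions are kept separate on purpose, since the paper allows asymmetric links whose formation probabilities may differ.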

Summary of the paper. Section II lists our contributions, relates them to the existing literature, and
introduces notation used in the paper. Section III describes our model of random networks and the
consensus algorithm. Sections IV and V study the weight optimization for symmetric random graphs
and asymmetric random graphs, respectively. Section VI demonstrates the effectiveness of our approach
with simulations. Finally, Section VII concludes the paper. We derive the proofs of some results in
Appendices A through C.

II. Contribution, Related Work, and Notation 

Contribution. Building our results on the previous extensive studies of convergence conditions and
rates for the consensus algorithm (e.g., [12], [15], [20]), we address the problem of weight optimization
in consensus algorithms with correlated random topologies. Our method is applicable to: 1) networks
with correlated random link failures (see, e.g., [20]) and 2) networks with randomized algorithms (see,
e.g., [11], [12]). We first address the weight design problem for symmetric random links, and then extend
the results to asymmetric random links.

With symmetric random links, we use the mean squared consensus convergence rate $\phi(W)$ as the
optimization criterion. We explicitly express the rate $\phi(W)$ as a function of the link formation
probabilities, their correlations, and the weights. We prove that $\phi(W)$ is a convex, nonsmooth function of
the weights. This enables global optimization of the weights for arbitrary link formation probabilities
and arbitrary link correlation structures. We solve numerically the resulting optimization problem by a
subgradient algorithm, showing also that the optimization computational cost grows tolerably with the
network size. We provide insights into weight design with a simple example of a complete random network
that admits a closed form solution for the optimal weights and convergence rate, and show how the optimal
weights depend on the number of nodes, the link formation probabilities, and their correlations.

We extend our results to the case of asymmetric random links, adopting as an optimization criterion
the mean squared deviation (from the current average state) rate $\psi(W)$, and show that $\psi(W)$ is a convex
function of the weights.

We provide comprehensive simulation experiments to demonstrate the effectiveness of our approach. 
We provide two different models of random networks with correlated link failures; in addition, we study
the broadcast gossip algorithm [12], as an example of randomized protocol with asymmetric links. In all 
cases, simulations confirm that our method shows significant gain compared to the methods available in 
the literature. Also, we show that the gain increases with the network size. 

Related work. Weight optimization for consensus with switching topologies has not received much 
attention in the literature. Reference [20] studies the tradeoff between the convergence rate and the 
amount of communication that takes place in the network. This reference is mainly concerned with the 
design of the network topology, i.e., the design of the probabilities of reliable communication $\{P_{ij}\}$ and
the weight $\alpha$ (assuming all nonzero weights are equal), assuming a communication cost $C_{ij}$ per link
and an overall network communication budget. Reference [12] proposes the broadcast gossip algorithm, 
where at each time step, a single node, selected at random, broadcasts unidirectionally its state to all the 
neighbors within its wireless range. We detail the broadcast gossip in subsection VI-B. This reference 
optimizes the weight for the broadcast gossip algorithm assuming equal weights for all links.

The problem of optimizing the weights for consensus under a random topology, when the weights for 
different links may be different, has not received much attention in the literature. Authors have proposed 
weight choices for random or time-varying networks [23], [14], but no claims to optimality are made. 
Reference [14] proposes the Metropolis weights (MW), based on the Metropolis-Hastings algorithm for 
simulating a Markov chain with uniform equilibrium distribution [24]. The weights choice in [23] is
based on the fastest mixing Markov chain problem studied in [25] and uses the information about the
underlying supergraph. We refer to this weight choice as the supergraph based weights (SGBW).

Notation. Vectors are denoted by a lower case letter (e.g., $x$) and it is understood from the context if $x$
denotes a deterministic or random vector. Symbol $\mathbb{R}^N$ is the $N$-dimensional Euclidean space. Inequality
$x \le y$ is understood element wise, i.e., it is equivalent to $x_i \le y_i$, for all $i$. Constant matrices are denoted
by capital letters (e.g., $X$) and random matrices are denoted by calligraphic letters (e.g., $\mathcal{X}$). A sequence
of random matrices is denoted by $\{\mathcal{X}(k)\}_{k=0}^{\infty}$ and the random matrix indexed by $k$ is denoted $\mathcal{X}(k)$. If
the distribution of $\mathcal{X}(k)$ is the same for any $k$, we shorten the notation $\mathcal{X}(k)$ to $\mathcal{X}$ when the time instant
$k$ is not of interest. Symbol $\mathbb{R}^{N \times M}$ denotes the set of $N \times M$ real valued matrices and $\mathbb{S}^N$ denotes
the set of symmetric real valued $N \times N$ matrices. The $i$-th column of a matrix $X$ is denoted by $X_i$.
Matrix entries are denoted by $X_{ij}$. Quantities $X \otimes Y$, $X \odot Y$, and $X \oplus Y$ denote the Kronecker product,
the Hadamard product, and the direct sum of the matrices $X$ and $Y$, respectively. Inequality $X \succeq Y$
($X \preceq Y$) means that the matrix $X - Y$ is positive (negative) semidefinite. Inequality $X \ge Y$ ($X \le Y$)
is understood entry wise, i.e., it is equivalent to $X_{ij} \ge Y_{ij}$, for all $i, j$. Quantities $\|X\|$, $\lambda_{\max}(X)$,
$r(X)$ denote the matrix 2-norm, the maximal eigenvalue, and the spectral radius of $X$, respectively. The
identity matrix is $I$. Given a matrix $A$, $\mathrm{Vec}(A)$ is the column vector that stacks the columns of $A$. For
given scalars $x_1, \ldots, x_N$, $\mathrm{diag}(x_1, \ldots, x_N)$ denotes the diagonal $N \times N$ matrix with the $i$-th diagonal entry
equal to $x_i$. Similarly, $\mathrm{diag}(x)$ is the diagonal matrix whose diagonal entries are the elements of $x$. The
matrix $\mathrm{diag}(X)$ is a diagonal matrix with the diagonal equal to the diagonal of $X$. The $N$-dimensional
column vector of ones is denoted with $1$. Symbol $J = \frac{1}{N} 1 1^\top$. The $i$-th canonical unit vector, i.e., the
$i$-th column of $I$, is denoted by $e_i$. Symbol $|S|$ denotes the cardinality of a set $S$.

III. Problem model 

This section introduces the random network model that we apply to networks with link failures and to
networks with randomized algorithms. It also introduces the consensus algorithm and the corresponding 
weight rule assumed in this paper. 

A. Random network model: symmetric and asymmetric random links 

We consider random networks— networks with random links or with a random protocol. Random links 
arise because of packet loss or drop, or when a sensor is activated from sleep mode at a random time. 
Randomized protocols like standard pairwise gossip [11] or broadcast gossip [12] activate links randomly. 
This section describes the network model that applies to both problems. We assume that the links are up
or down (link failures) or selected for use (randomized gossip) according to spatially correlated Bernoulli
random variables.

To be specific, the network is modeled by a graph $G = (V, E)$, where the set of nodes $V$ has cardinality
$|V| = N$ and the set of directed edges $E$, with $|E| = 2M$, collects all possible ordered node pairs that
can communicate, i.e., all realizable links. For example, with geometric graphs, realizable links connect
nodes within their communication radius. The graph $G$ is called the supergraph, e.g., [20]. The directed edge
$(i, j) \in E$ if node $j$ can transmit to node $i$.

The supergraph $G$ is assumed to be connected and without loops. For the fully connected supergraph,
the number of directed edges (arrows) $2M$ is equal to $N(N-1)$. We are interested in sparse supergraphs,
i.e., the case when $M \ll \frac{1}{2} N(N-1)$.

Associated with the graph $G$ is its $N \times N$ adjacency matrix $A$:

$$A_{ij} = \begin{cases} 1 & \text{if } (i,j) \in E \\ 0 & \text{otherwise} \end{cases}$$

The in-neighborhood set $\Omega_i$ (nodes that can transmit to node $i$) and the in-degree $d_i$ of a node $i$ are

$$\Omega_i = \{ j : (i,j) \in E \}, \qquad d_i = |\Omega_i| .$$

We model the connectivity of a random WSN at time step $k$ by a (possibly) directed random graph
$\mathcal{G}(k) = (V, \mathcal{E}(k))$. The random edge set is

$$\mathcal{E}(k) = \{ (i,j) \in E : (i,j) \text{ is online at time step } k \},$$

with $\mathcal{E}(k) \subseteq E$. The random adjacency matrix associated to $\mathcal{G}(k)$ is denoted by $\mathcal{A}(k)$ and the random
in-neighborhood for sensor $i$ by $\Omega_i(k)$.

We assume that link failures are temporally independent and spatially correlated. That is, we assume
that the random matrices $\mathcal{A}(k)$, $k = 0, 1, 2, \ldots$ are independent identically distributed. The state of the link
$(i,j)$ at a time step $k$ is a Bernoulli random variable, with mean $P_{ij}$, i.e., $P_{ij}$ is the formation probability
of link $(i,j)$. At time step $k$, different edges $(i,j)$ and $(p,q)$ may be correlated, i.e., the entries $\mathcal{A}_{ij}(k)$
and $\mathcal{A}_{pq}(k)$ may be correlated. For the link $r$, by which node $j$ transmits to node $i$, and for the link $s$,
by which node $q$ transmits to node $p$, the corresponding cross-variance is

$$[R_q]_{rs} = \mathrm{E}\left[ \mathcal{A}_{ij}(k) \mathcal{A}_{pq}(k) \right] - P_{ij} P_{pq} .$$

Time correlation, like spatial correlation, arises naturally in many scenarios, such as when nodes awake
from the sleep schedule. However, it requires an approach different from the one we pursue in this paper [19].
We plan to address the weight optimization with temporally correlated links in our future work.

B. Consensus algorithm 

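Let $x_i(0)$ represent some scalar measurement or initial data available at sensor $i$, $i = 1, \ldots, N$. Denote
by $x_{\mathrm{avg}}$ the average:

$$x_{\mathrm{avg}} = \frac{1}{N} \sum_{i=1}^{N} x_i(0)$$

The consensus algorithm computes $x_{\mathrm{avg}}$ iteratively at each sensor $i$ by the distributed weighted average:

$$x_i(k+1) = \mathcal{W}_{ii}(k) x_i(k) + \sum_{j \in \Omega_i(k)} \mathcal{W}_{ij}(k) x_j(k) \tag{1}$$

We assume that the random weights $\mathcal{W}_{ij}(k)$ at iteration $k$ are given by:

$$\mathcal{W}_{ij}(k) = \begin{cases} W_{ij} & \text{if } j \in \Omega_i(k) \\ 1 - \sum_{m \in \Omega_i(k)} \mathcal{W}_{im}(k) & \text{if } i = j \\ 0 & \text{otherwise} \end{cases} \tag{2}$$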



In (2), the quantities $W_{ij}$ are non random and will be the variables to be optimized in our work. We
also take $W_{ii} = 0$, for all $i$. By (2), when the link is active, the weight is $W_{ij}$, and when not active it is
zero. Note that $W_{ij}$ are non zero only for edges $(i, j)$ in the supergraph $G$. If an edge $(i, j)$ is not in the
supergraph, the corresponding $W_{ij} = 0$ and $\mathcal{W}_{ij}(k) = 0$.
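As a minimal sketch of the weight rule (2), the function below builds one realization of the random weight matrix from the design weights and a realized 0/1 link pattern. The function name and the three-node example are illustrative assumptions, not part of the paper.

```python
def realized_weights(W, A_k):
    """Build one realization of the random weight matrix following eq. (2).

    W   : N x N list of lists with the design weights (zero diagonal)
    A_k : N x N 0/1 list of lists, A_k[i][j] = 1 if link (i, j) is online
    """
    n = len(W)
    Wk = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and A_k[i][j] == 1:
                Wk[i][j] = W[i][j]       # an active link keeps its design weight
        Wk[i][i] = 1.0 - sum(Wk[i])      # the self-weight makes the row sum to one
    return Wk


# Hypothetical path supergraph 0 - 1 - 2 with link (0, 1) online and (1, 2) offline.
W = [[0.0, 0.4, 0.0],
     [0.4, 0.0, 0.3],
     [0.0, 0.3, 0.0]]
A_k = [[0, 1, 0],
       [1, 0, 0],
       [0, 0, 0]]
Wk = realized_weights(W, A_k)
```

Node 2 receives nothing in this realization, so its row reduces to a self-weight of one and its state is simply held.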

We write the consensus algorithm in compact form. Let $x(k) = (x_1(k)\ x_2(k)\ \ldots\ x_N(k))^\top$, $W = [W_{ij}]$,
$\mathcal{W}(k) = [\mathcal{W}_{ij}(k)]$. The random weight matrix $\mathcal{W}(k)$ can be written in compact form as

$$\mathcal{W}(k) = W \odot \mathcal{A}(k) - \mathrm{diag}\left( W \mathcal{A}(k) \right) + I \tag{3}$$

and the consensus algorithm is simply stated with $x(k = 0) = x(0)$ as

$$x(k+1) = \mathcal{W}(k) x(k), \quad k \ge 0 \tag{4}$$

To implement the update rule, nodes need to know their random in-neighborhood $\Omega_i(k)$ at every iteration.
In practice, nodes determine $\Omega_i(k)$ based on who they receive messages from at iteration $k$.

It is well known [12], [15] that, when the random matrix $\mathcal{W}(k)$ is symmetric, the consensus algorithm
is average preserving, i.e., the sum of the states $x_i(k)$, and so the average state over time, does not change,
even in the presence of random links. In that case the consensus algorithm converges almost surely to the
true average $x_{\mathrm{avg}}$. When the matrix $\mathcal{W}(k)$ is not symmetric, the average state is not preserved in time,
and the state of each node converges to the same random variable with bounded mean squared error
from $x_{\mathrm{avg}}$ [12]. For certain applications, where high precision on computing the average $x_{\mathrm{avg}}$ is required,
average preserving, and thus a symmetric matrix $\mathcal{W}(k)$, is desirable. In practice, a symmetric matrix
$\mathcal{W}(k)$ can be established by protocol design even if the underlying physical channels are asymmetric.
This can be realized by ignoring unidirectional communication channels, for instance,
with a double acknowledgement protocol. In this scenario, effectively, the consensus algorithm sees the
underlying random network as a symmetric network, and this scenario falls into the framework of our
studies of symmetric links (section IV).
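The average-preserving behaviour with symmetric links can be illustrated with a short simulation sketch of (4). The ring supergraph, the common weight 0.3, the formation probability 0.7, the initial states, and the assumption of independent links are all hypothetical choices made only for this example.

```python
import random


def consensus_step(x, W, links, p, rng):
    """One iteration x(k+1) = W(k) x(k) with symmetric random links.

    links : undirected pairs (i, j) of the supergraph
    p     : common formation probability (links independent, an assumption)
    """
    x_next = list(x)
    for i, j in links:
        if rng.random() < p:                       # link online in both directions
            x_next[i] += W[i][j] * (x[j] - x[i])   # same as row i of W(k) applied to x
            x_next[j] += W[j][i] * (x[i] - x[j])
    return x_next


n = 5
links = [(i, (i + 1) % n) for i in range(n)]       # hypothetical ring supergraph
W = [[0.0] * n for _ in range(n)]
for i, j in links:
    W[i][j] = W[j][i] = 0.3                        # symmetric design weights

rng = random.Random(0)
x0 = [1.0, 4.0, -2.0, 7.0, 0.0]
x = list(x0)
for _ in range(100):
    x = consensus_step(x, W, links, 0.7, rng)
```

Because the two increments on every active link cancel, the sum of the states stays fixed at every iteration, which is the average-preserving property stated above.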

When the physical communication channels are asymmetric, and the error on the asymptotic consensus
limit $c$ is tolerable, consensus with an asymmetric weight matrix $\mathcal{W}(k)$ can be used. This type of algorithm
is easier to implement, since there is no need for acknowledgement protocols. An example of such a
protocol is the broadcast gossip algorithm proposed in [12]. Section V studies this type of algorithm.

Set of possible weight choices: symmetric network. With symmetric random links, we will always
assume $W_{ij} = W_{ji}$. By doing this we easily achieve the desirable property that $\mathcal{W}(k)$ is symmetric. The
set of all possible weight choices for symmetric random links $S_W$ becomes:

$$S_W = \left\{ W \in \mathbb{R}^{N \times N} : W_{ij} = W_{ji},\ W_{ij} = 0 \text{ if } (i,j) \notin E,\ W_{ii} = 0\ \forall i \right\} \tag{5}$$

Set of possible weight choices: asymmetric network. With asymmetric random links, there is no
good reason to require that $W_{ij} = W_{ji}$, and thus we drop the restriction $W_{ij} = W_{ji}$. The set of possible
weight choices in this case becomes:

$$S_W^{\mathrm{asym}} = \left\{ W \in \mathbb{R}^{N \times N} : W_{ij} = 0 \text{ if } (i,j) \notin E,\ W_{ii} = 0\ \forall i \right\} \tag{6}$$

Depending on whether the random network is symmetric or asymmetric, there will be two error quantities
that will play a role. These will be discussed in detail in sections IV and V, respectively. We introduce 
them here briefly, for reference. 

Mean square error (MSE): symmetric network. Define the consensus error vector $e(k)$ and the
error covariance matrix $\Sigma(k)$:

$$e(k) = x(k) - x_{\mathrm{avg}} 1 \tag{7}$$

$$\Sigma(k) = \mathrm{E}\left[ e(k) e(k)^\top \right] . \tag{8}$$

The mean squared consensus error MSE is given by:

$$\mathrm{MSE}(k) = \sum_{i=1}^{N} \mathrm{E}\left[ (x_i(k) - x_{\mathrm{avg}})^2 \right] = \mathrm{E}\left[ e(k)^\top e(k) \right] = \mathrm{tr}\, \Sigma(k) \tag{9}$$

Mean square deviation (MSdev): asymmetric network. As explained, when the random links are
asymmetric (i.e., when $\mathcal{W}(k)$ is not symmetric), and if the underlying supergraph is strongly connected,
then the states of all nodes converge to a common value $c$ that is in general a random variable that
depends on the sequence of network realizations and on the initial state $x(0)$ (see [15], [12]). In order
to have $c = x_{\mathrm{avg}}$, almost surely, an additional condition must be satisfied:

$$1^\top \mathcal{W}(k) = 1^\top, \quad \text{a.s.} \tag{10}$$

See [15], [12] for the details. We remark that (10) is a crucial assumption in the derivation of the MSE
decay (25). Theoretically, equation (23) is still valid if the condition $\mathcal{W}(k) = \mathcal{W}(k)^\top$ is relaxed to
$1^\top \mathcal{W}(k) = 1^\top$. While this condition is trivially satisfied for symmetric links and symmetric weights
$W_{ij} = W_{ji}$, it is very difficult to realize (10) in practice when the random links are asymmetric. So, in
our work, we do not assume (10) with asymmetric links.

For asymmetric networks, we follow reference [12] and introduce the mean square state deviation
MSdev as a performance measure. Denote the current average of the node states by $x_{\mathrm{avg}}(k) = \frac{1}{N} 1^\top x(k)$.
Quantity MSdev describes how far apart different states $x_i(k)$ are; it is given by

$$\mathrm{MSdev}(k) = \sum_{i=1}^{N} \mathrm{E}\left[ (x_i(k) - x_{\mathrm{avg}}(k))^2 \right] = \mathrm{E}\left[ \zeta(k)^\top \zeta(k) \right],$$

where

$$\zeta(k) = x(k) - x_{\mathrm{avg}}(k) 1 = (I - J) x(k). \tag{11}$$
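The difference between the two error quantities can be seen on a single state vector. The sketch below computes the per-realization squared error and squared deviation; MSE and MSdev are the expectations of these quantities, which one could estimate by averaging over many simulated runs. The function names and the numbers are illustrative assumptions.

```python
def squared_error(x, x_avg):
    """Sum of squared distances to the initial average x_avg, as inside eq. (9)."""
    return sum((xi - x_avg) ** 2 for xi in x)


def squared_deviation(x):
    """Sum of squared distances to the current average, as in the MSdev definition."""
    current_avg = sum(x) / len(x)
    return sum((xi - current_avg) ** 2 for xi in x)


# Hypothetical run: the initial states average to 2.0, but asymmetric updates
# drove all nodes to agree on 2.5 instead.
x_avg = sum([1.0, 2.0, 3.0]) / 3
x_final = [2.5, 2.5, 2.5]
err = squared_error(x_final, x_avg)     # agreement reached, but on a biased value
dev = squared_deviation(x_final)        # zero: the nodes agree with each other
```

This mirrors the text: with asymmetric links the deviation can vanish while the error with respect to the true initial average does not.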

C. Symmetric links: Statistics of $\mathcal{W}(k)$

In this subsection, we derive closed form expressions for the first and the second order statistics of
the random matrix $\mathcal{W}(k)$. Let $q(k)$ be the random vector that collects the non redundant entries of $\mathcal{A}(k)$:

$$q_l(k) = \mathcal{A}_{ij}(k), \quad i < j, \ (i,j) \in E, \tag{12}$$

where the entries of $\mathcal{A}(k)$ are ordered in lexicographic order with respect to $i$ and $j$, from left to right,
top to bottom. For symmetric links, $\mathcal{A}_{ij}(k) = \mathcal{A}_{ji}(k)$, so the dimension of $q(k)$ is half of the number of
directed links, i.e., $M$. We let the mean and the covariance of $q(k)$ and $\mathrm{Vec}(\mathcal{A}(k))$ be:

$$\pi = \mathrm{E}[q(k)] \tag{13}$$

$$\pi_l = \mathrm{E}[q_l(k)] \tag{14}$$

$$R_q = \mathrm{Cov}(q(k)) = \mathrm{E}\left[ (q(k) - \pi)(q(k) - \pi)^\top \right] \tag{15}$$

$$R_A = \mathrm{Cov}(\mathrm{Vec}(\mathcal{A}(k))) \tag{16}$$

The relation between $R_q$ and $R_A$ can be written as:

$$R_A = F R_q F^\top \tag{17}$$



where $F \in \mathbb{R}^{N^2 \times M}$ is the zero one selection matrix that linearly maps $q(k)$ to $\mathrm{Vec}(\mathcal{A}(k))$, i.e.,
$\mathrm{Vec}(\mathcal{A}(k)) = F q(k)$. We introduce further notation. Let $P$ be the matrix of the link formation
probabilities

$$P = [P_{ij}]$$

Define the matrix $B \in \mathbb{R}^{N^2 \times N^2}$ with $N \times N$ zero diagonal blocks and $N \times N$ off diagonal blocks $B_{ij}$
equal to:

$$B_{ij} = 1 e_i^\top + e_j 1^\top$$

and write $W$ in terms of its columns $W = [W_1\ W_2\ \cdots\ W_N]$. We let

$$W_C = W_1 \oplus W_2 \oplus \cdots \oplus W_N$$

For symmetric random networks, the mean of the random weight matrix $\mathcal{W}(k)$ and of $\mathcal{W}^2(k)$ play an
important role for the convergence rate of the consensus algorithm. Using the above notation, we can get
compact representations for these quantities, as provided in Lemma 1, proved in Appendix A.

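Lemma 1 Consider the consensus algorithm (4). Then the mean $\bar{W}$ and the second moment $R_C$ of $\mathcal{W}$ defined
below are:

$$\bar{W} = \mathrm{E}[\mathcal{W}] = W \odot P + I - \mathrm{diag}(WP) \tag{18}$$

$$R_C = \mathrm{E}\left[ \mathcal{W}^2 \right] - \bar{W}^2 \tag{19}$$

$$\phantom{R_C} = W_C^\top \left[ R_A \odot \left( I \otimes 1 1^\top + 1 1^\top \otimes I - B \right) \right] W_C \tag{20}$$

In the special case of spatially uncorrelated links, the second moment $R_C$ of $\mathcal{W}$ is

$$\tfrac{1}{2} R_C = \mathrm{diag}\left\{ \left( (1 1^\top - P) \odot P \right) (W \odot W) \right\} - (1 1^\top - P) \odot P \odot W \odot W \tag{21}$$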

For asymmetric random links, the expression for the mean of the random weight matrix $\mathcal{W}(k)$ remains the
same (as in Lemma 1). For asymmetric random links, instead of $\mathrm{E}[\mathcal{W}^2(k)] - J$ (consider eqns. (18), (19)
and the term $\mathrm{E}[\mathcal{W}^2(k)]$ in it), the quantity of interest becomes $\mathrm{E}[\mathcal{W}^\top(k) (I - J) \mathcal{W}(k)]$. (The quantity of
interest is different since the optimization criterion will be different.) For symmetric links, the matrix
$\mathrm{E}[\mathcal{W}^2] - J$ is a quadratic matrix function of the weights $W_{ij}$; it also depends quadratically on the
$P_{ij}$'s and is an affine function of the $[R_q]_{ij}$'s. The same will still hold for $\mathrm{E}[\mathcal{W}^\top(k) (I - J) \mathcal{W}(k)]$ in the
case of asymmetric random links. The difference, however, is that $\mathrm{E}[\mathcal{W}^\top(k) (I - J) \mathcal{W}(k)]$ does not
admit the compact representation as given in (19), and we do not pursue here cumbersome entry wise
representations. In Appendix C, we do present the expressions for the matrix $\mathrm{E}[\mathcal{W}^\top(k) (I - J) \mathcal{W}(k)]$
for the broadcast gossip algorithm [12] (that we study in subsection VI-B).
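The uncorrelated-link expressions of Lemma 1 can be checked numerically on a very small network by enumerating every link state. The sketch below does this for a three-node complete supergraph with symmetric, independent links; the weights and probabilities are hypothetical, and R_C is taken to mean E[W^2] minus the square of the mean, as in (19).

```python
from itertools import product


def matmul(X, Y):
    """Plain matrix product for small square lists of lists."""
    n = len(X)
    return [[sum(X[i][m] * Y[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


n = 3
links = [(0, 1), (0, 2), (1, 2)]               # complete supergraph on 3 nodes
w = {(0, 1): 0.2, (0, 2): 0.5, (1, 2): 0.3}    # hypothetical design weights
p = {(0, 1): 0.9, (0, 2): 0.6, (1, 2): 0.4}    # hypothetical formation probabilities

W = [[0.0] * n for _ in range(n)]
P = [[0.0] * n for _ in range(n)]
for i, j in links:
    W[i][j] = W[j][i] = w[(i, j)]
    P[i][j] = P[j][i] = p[(i, j)]

# Exact expectations by enumerating all 2^3 states of the independent links.
mean_W = [[0.0] * n for _ in range(n)]
mean_W2 = [[0.0] * n for _ in range(n)]
for states in product([0, 1], repeat=len(links)):
    prob = 1.0
    Wk = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
    for (i, j), a in zip(links, states):
        prob *= p[(i, j)] if a else 1.0 - p[(i, j)]
        if a:                                   # active link: weight rule of eq. (2)
            Wk[i][j] = Wk[j][i] = W[i][j]
            Wk[i][i] -= W[i][j]
            Wk[j][j] -= W[i][j]
    Wk2 = matmul(Wk, Wk)
    for r in range(n):
        for c in range(n):
            mean_W[r][c] += prob * Wk[r][c]
            mean_W2[r][c] += prob * Wk2[r][c]

# Closed forms: eq. (18) for the mean and eq. (21) for R_C.
V = [[(1.0 - P[r][c]) * P[r][c] for c in range(n)] for r in range(n)]
mean_formula = [[W[r][c] * P[r][c] for c in range(n)] for r in range(n)]
Rc_formula = [[-2.0 * V[r][c] * W[r][c] ** 2 for c in range(n)] for r in range(n)]
for r in range(n):
    mean_formula[r][r] = 1.0 - sum(W[r][m] * P[m][r] for m in range(n))
    Rc_formula[r][r] = 2.0 * sum(V[r][m] * W[m][r] ** 2 for m in range(n))

mean_sq = matmul(mean_W, mean_W)
err_mean = max(abs(mean_W[r][c] - mean_formula[r][c])
               for r in range(n) for c in range(n))
err_cov = max(abs(mean_W2[r][c] - mean_sq[r][c] - Rc_formula[r][c])
              for r in range(n) for c in range(n))
```

Both `err_mean` and `err_cov` should be at the level of floating point rounding. Brute-force enumeration is only feasible for a handful of links, which is one reason the closed forms are useful.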

IV. Weight optimization: symmetric random links 

A. Optimization criterion: Mean square convergence rate 

We are interested in finding the rate at which $\mathrm{MSE}(k)$ decays to zero and in optimizing this rate with
respect to the weights $W$. First we derive the recursion for the error $e(k)$. We have from eqn. (4):

$$1^\top x(k+1) = 1^\top \mathcal{W}(k) x(k) = 1^\top x(k) = 1^\top x(0) = N x_{\mathrm{avg}}$$

$$1^\top e(k) = 1^\top x(k) - 1^\top 1\, x_{\mathrm{avg}} = N x_{\mathrm{avg}} - N x_{\mathrm{avg}} = 0$$

We derive the error vector dynamics:

$$e(k+1) = x(k+1) - x_{\mathrm{avg}} 1 = \mathcal{W}(k) x(k) - \mathcal{W}(k) x_{\mathrm{avg}} 1 = \mathcal{W}(k) e(k) = (\mathcal{W}(k) - J) e(k) \tag{22}$$

where the last equality holds because $J e(k) = \frac{1}{N} 1 1^\top e(k) = 0$.

Recall the definition of the mean squared consensus error (9) and the error covariance matrix in eqn. (8)
and recall that $\mathrm{MSE}(k) = \mathrm{tr}\, \Sigma(k) = \mathrm{E}[e(k)^\top e(k)]$. Introduce the quantity

$$\phi(W) = \lambda_{\max}\left( \mathrm{E}[\mathcal{W}^2] - J \right) \tag{23}$$

The next Lemma shows that the mean squared error decays at the rate $\phi(W)$.

Lemma 2 (m.s.s. convergence rate) Consider the consensus algorithm given by eqn. (4). Then:

$$\mathrm{tr}\left( \Sigma(k+1) \right) = \mathrm{tr}\left( \left( \mathrm{E}[\mathcal{W}^2] - J \right) \Sigma(k) \right) \tag{24}$$

$$\mathrm{tr}\left( \Sigma(k+1) \right) \le \phi(W)\, \mathrm{tr}\left( \Sigma(k) \right), \quad k \ge 0 \tag{25}$$

Proof: From the definition of the covariance $\Sigma(k+1)$, using the dynamics of the error $e(k+1)$,
interchanging expectation with the tr operator, using properties of the trace, interchanging the expectation
with the tr once again, using the independence of $e(k)$ and $\mathcal{W}(k)$, and, finally, noting that $\mathcal{W}(k) J = J$,
we get (24). The independence between $e(k)$ and $\mathcal{W}(k)$ follows because $\mathcal{W}(k)$ is an i.i.d. sequence,
and $e(k)$ depends on $\mathcal{W}(0), \ldots, \mathcal{W}(k-1)$. Then $e(k)$ and $\mathcal{W}(k)$ are independent by the disjoint block
theorem [26]. Having (24), eqn. (25) can be easily shown, for example, by exercise 18, page 423, [27].

■
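The chain of steps described verbally in the proof of Lemma 2 can be written out as follows. This is a sketch of one possible ordering of those steps, using only facts stated above: $\mathcal{W}(k)$ is symmetric, $\mathcal{W}(k) J = J \mathcal{W}(k) = J$, $J^2 = J$, and $e(k)$ is independent of $\mathcal{W}(k)$. The last line uses the standard bound $\mathrm{tr}(M \Sigma) \le \lambda_{\max}(M)\, \mathrm{tr}(\Sigma)$ for symmetric $M$ and positive semidefinite $\Sigma$.

```latex
\begin{align*}
\operatorname{tr}\Sigma(k+1)
 &= \mathrm{E}\left[ e(k+1)^\top e(k+1) \right]
  = \mathrm{E}\left[ e(k)^\top (\mathcal{W}(k)-J)^2 e(k) \right] \\
 &= \mathrm{E}\left[ \operatorname{tr}\left( (\mathcal{W}(k)-J)^2\, e(k) e(k)^\top \right) \right]
  = \operatorname{tr}\left( \mathrm{E}\left[ (\mathcal{W}(k)-J)^2 \right] \Sigma(k) \right) \\
 &= \operatorname{tr}\left( \left( \mathrm{E}[\mathcal{W}^2] - J \right) \Sigma(k) \right)
  \qquad \text{since } (\mathcal{W}-J)^2 = \mathcal{W}^2 - \mathcal{W}J - J\mathcal{W} + J^2 = \mathcal{W}^2 - J \\
 &\le \lambda_{\max}\left( \mathrm{E}[\mathcal{W}^2] - J \right) \operatorname{tr}\Sigma(k)
  = \phi(W)\, \operatorname{tr}\Sigma(k).
\end{align*}
```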

We remark that, in the case of asymmetric random links, MSE does not asymptotically go to zero. For
the case of asymmetric links, we use a different performance metric. This will be detailed in section V.

B. Symmetric links: Weight optimization problem formulation 

We now formulate the weight optimization problem as finding the weights $W_{ij}$ that optimize the mean
squared rate of convergence:

$$\begin{array}{ll} \text{minimize} & \phi(W) \\ \text{subject to} & W \in S_W \end{array} \tag{26}$$

The set $S_W$ is defined in eqn. (5) and the rate $\phi(W)$ is given by (23). The optimization problem (26) is
unconstrained, since effectively the optimization variables are $W_{ij} \in \mathbb{R}$, $(i,j) \in E$, other entries of $W$
being zero.

A point $W^* \in S_W$ such that $\phi(W^*) < 1$ will always exist if the supergraph $G$ is connected.
Reference [28] studies the case when the random matrices $\mathcal{W}(k)$ are stochastic and shows that $\phi(W^*) < 1$
if the supergraph is connected and all the realizations of the random matrix $\mathcal{W}(k)$ are stochastic symmetric
matrices. Thus, to locate a point $W^* \in S_W$ such that $\phi(W^*) < 1$, we just take $W^*$ that assures all the
realizations of $\mathcal{W}$ be symmetric stochastic matrices. It is trivial to show that for any point in the set

$$S_{\mathrm{stoch}} = \left\{ W \in S_W : W_{ij} > 0 \text{ if } (i,j) \in E,\ W 1 < 1 \right\} \subset S_W \tag{27}$$

all the realizations of $\mathcal{W}(k)$ are stochastic, symmetric. Thus, for any point $W^* \in S_{\mathrm{stoch}}$, we have that
$\phi(W^*) < 1$ if the graph is connected.

We remark that the optimum $W^*$ does not have to lie in the set $S_{\mathrm{stoch}}$. In general, $W^*$ lies in the set

$$S_{\mathrm{conv}} = \left\{ W \in S_W : \phi(W) < 1 \right\} \subset S_W \tag{28}$$

The set $S_{\mathrm{stoch}}$ is a proper subset of $S_{\mathrm{conv}}$. (If $W \in S_{\mathrm{stoch}}$ then $\phi(W) < 1$, but the converse statement is not
true in general.) We also remark that the consensus algorithm (4) converges almost surely if $\phi(W) < 1$
(not only in mean squared sense). This can be shown, for instance, by the technique developed in [28].

We now relate (26) to reference [29]. This reference studies the weight optimization for the case of
a static topology. In this case the topology is deterministic, described by the supergraph $G$. The link
formation probability matrix $P$ reduces to the supergraph adjacency (zero-one) matrix $A$, since the links
occur always if they are realizable. Also, the link covariance matrix $R_q$ becomes zero. The weight matrix
$\mathcal{W}$ is deterministic and equal to

$$\mathcal{W} = \bar{W} = W \odot A - \mathrm{diag}(WA) + I$$

Recall that $r(X)$ denotes the spectral radius of $X$. Then, the quantities $\left( r(\bar{W} - J) \right)^2$ and $\phi(W)$ coincide.
Thus, for the case of static topology, the optimization problem (26) that we address reduces to the
optimization problem proposed in [29].

C. Convexity of the weight optimization problem 

We show that $\phi : S_W \to \mathbb{R}_+$ is convex, where $S_W$ is defined in eqn. (5) and $\phi(W)$ by eqn. (23).

Lemma 1 gives the closed form expression of $\mathrm{E}[\mathcal{W}^2]$. We see that $\phi(W)$ is the composition of a
quadratic matrix function and $\lambda_{\max}(\cdot)$. Such a composition is not convex in general. However, the next
Lemma shows that $\phi(W)$ is convex for our problem.

Lemma 3 (Convexity of $\phi(W)$) The function $\phi : S_W \to \mathbb{R}_+$ is convex.

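Proof: Choose arbitrary $X, Y \in S_W$. We restrict our attention to matrices $W$ of the form

$$W = X + tY, \quad t \in \mathbb{R}. \tag{29}$$

Recall the expression for $\mathcal{W}$ given by (2) and (3). For the matrix $W$ given by (29), we have for $\mathcal{W} = \mathcal{W}(t)$

$$\mathcal{W}(t) = I - \mathrm{diag}\left[ (X + tY) \mathcal{A} \right] + (X + tY) \odot \mathcal{A} = \mathcal{X} + t \mathcal{Y}, \tag{30}$$

$$\mathcal{X} = X \odot \mathcal{A} + I - \mathrm{diag}(X \mathcal{A}), \qquad \mathcal{Y} = Y \odot \mathcal{A} - \mathrm{diag}(Y \mathcal{A}).$$

Introduce the auxiliary function $\eta : \mathbb{R} \to \mathbb{R}_+$,

$$\eta(t) = \lambda_{\max}\left( \mathrm{E}\left[ \mathcal{W}(t)^2 \right] - J \right) \tag{31}$$

To prove that $\phi(W)$ is convex, it suffices to prove that the function $\eta$ is convex. Introduce $\mathcal{Z}(t)$ and
compute successively

$$\mathcal{Z}(t) = \mathcal{W}(t)^2 - J \tag{32}$$

$$\phantom{\mathcal{Z}(t)} = (\mathcal{X} + t \mathcal{Y})^2 - J \tag{33}$$

$$\phantom{\mathcal{Z}(t)} = t^2 \mathcal{Y}^2 + t \left( \mathcal{X} \mathcal{Y} + \mathcal{Y} \mathcal{X} \right) + \mathcal{X}^2 - J \tag{34}$$

$$\phantom{\mathcal{Z}(t)} = t^2 \mathcal{Z}_2 + t \mathcal{Z}_1 + \mathcal{Z}_0 \tag{35}$$

The random matrices $\mathcal{Z}_2$, $\mathcal{Z}_1$, and $\mathcal{Z}_0$ do not depend on $t$. Also, $\mathcal{Z}_2$ is positive semidefinite. The function
$\eta(t)$ can be expressed as

$$\eta(t) = \lambda_{\max}\left( \mathrm{E}[\mathcal{Z}(t)] \right)$$

We will now derive that

$$\mathcal{Z}\left( (1-\alpha) t + \alpha u \right) \preceq (1-\alpha) \mathcal{Z}(t) + \alpha \mathcal{Z}(u), \quad \forall \alpha \in [0,1],\ \forall t, u \in \mathbb{R} \tag{36}$$

Since $t \mapsto t^2$ is convex, the following inequality holds:

$$\left[ (1-\alpha) t + \alpha u \right]^2 \le (1-\alpha) t^2 + \alpha u^2, \quad \alpha \in [0,1] \tag{37}$$

Since the matrix $\mathcal{Z}_2$ is positive semidefinite, eqn. (37) implies that:

$$\left( (1-\alpha) t + \alpha u \right)^2 \mathcal{Z}_2 \preceq (1-\alpha) t^2 \mathcal{Z}_2 + \alpha u^2 \mathcal{Z}_2, \quad \alpha \in [0,1]$$

After adding to both sides $\left( (1-\alpha) t + \alpha u \right) \mathcal{Z}_1 + \mathcal{Z}_0$, we get eqn. (36). Taking the expectation of both
sides of (36), we get:

$$\mathrm{E}\left[ \mathcal{Z}\left( (1-\alpha) t + \alpha u \right) \right] \preceq \mathrm{E}\left[ (1-\alpha) \mathcal{Z}(t) + \alpha \mathcal{Z}(u) \right] = (1-\alpha) \mathrm{E}[\mathcal{Z}(t)] + \alpha \mathrm{E}[\mathcal{Z}(u)], \quad \alpha \in [0,1]$$

Now, we have that:

$$\eta\left( (1-\alpha) t + \alpha u \right) = \lambda_{\max}\left( \mathrm{E}\left[ \mathcal{Z}\left( (1-\alpha) t + \alpha u \right) \right] \right)$$

$$\le \lambda_{\max}\left( (1-\alpha) \mathrm{E}[\mathcal{Z}(t)] + \alpha \mathrm{E}[\mathcal{Z}(u)] \right)$$

$$\le (1-\alpha) \lambda_{\max}\left( \mathrm{E}[\mathcal{Z}(t)] \right) + \alpha \lambda_{\max}\left( \mathrm{E}[\mathcal{Z}(u)] \right)$$

$$= (1-\alpha) \eta(t) + \alpha \eta(u), \quad \alpha \in [0,1]$$

The first inequality holds because $\lambda_{\max}(\cdot)$ is monotone with respect to the semidefinite order, and the
last inequality holds since $\lambda_{\max}(\cdot)$ is convex. This implies $\eta(t)$ is convex and hence $\phi(W)$ is convex.

■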

We remark that convexity of $\phi(W)$ is not obvious and requires proof. The function $\phi(W)$ is a
composition of a matrix quadratic function and $\lambda_{\max}(\cdot)$. Although the function $\lambda_{\max}(\cdot)$ is a convex function
of its argument, one still has to show that the following composition is convex:
$W \mapsto \mathrm{E}[\mathcal{W}^2] - J \mapsto \phi(W) = \lambda_{\max}\left( \mathrm{E}[\mathcal{W}^2] - J \right)$.


D. Fully connected random network: Closed form solution

To get some insight into how the optimal weights depend on the network parameters, we consider the
impractical, but simple geometry of a complete random symmetric graph. For this example, the
optimization problem (26) admits a closed form solution, while, in general, numerical optimization is needed
to solve (26). Although not practical, this example provides insight into how the optimal weights depend
on the network size $N$, the link formation probabilities, and the link formation spatial correlations.
The supergraph is symmetric, fully connected, with $N$ nodes and $M = N(N-1)/2$ undirected links.
We assume that all the links have the same formation probability, i.e., that $\mathrm{Prob}(q_l = 1) = \pi_l = p$,
$p \in (0, 1]$, $l = 1, \ldots, M$. We assume that the cross-variance between any pair of links $i$ and $j$ equals
$[R_q]_{ij} = \beta p (1-p)$, where $\beta$ is the correlation coefficient. The matrix $R_q$ is given by

$$R_q = p(1-p) \left[ (1-\beta) I + \beta\, 1 1^\top \right] .$$

The eigenvalues of $R_q$ are $\lambda_1(R_q) = p(1-p)\left( 1 + (M-1)\beta \right)$, and $\lambda_i(R_q) = p(1-p)(1-\beta) \ge 0$,
$i = 2, \ldots, M$. The condition that $R_q \succeq 0$ implies that $\beta \ge -1/(M-1)$. Also, we have that

$$\beta := \frac{\mathrm{E}[q_i q_j] - \mathrm{E}[q_i] \mathrm{E}[q_j]}{\sqrt{\mathrm{Var}(q_i)} \sqrt{\mathrm{Var}(q_j)}} \tag{38}$$

$$\phantom{\beta :} = \frac{\mathrm{Prob}(q_i = 1, q_j = 1) - p^2}{p(1-p)} \ge -\frac{p}{1-p} \tag{39}$$

Thus, the range of $\beta$ is restricted to

$$\max\left( -\frac{1}{M-1},\ -\frac{p}{1-p} \right) \le \beta \le 1 . \tag{40}$$

Due to the problem symmetry, the optimal weights for all links are the same, say $W^*$. The expressions
for the optimal weight $W^*$ and for the optimal convergence rate $\phi^*$ can be obtained after careful
manipulations and expressing the matrix $\mathrm{E}[\mathcal{W}^2] - J$ explicitly in terms of $p$ and $\beta$; then, it is easy
to show that:

W* = (41) 

The optimal weight $W^*$ decreases as $\beta$ increases. This is also intuitive, since positive correlations imply
that the links emanating from the same node tend to occur simultaneously, and thus the weight should be
smaller. Similarly, negative correlations imply that the links emanating from the same node tend to occur
exclusively, which results in larger weights. Finally, we observe that in the uncorrelated case ($\beta = 0$),
as $N$ becomes very large, the optimal weight behaves as $1/(Np)$. Thus, for uncorrelated links and a
large network, the optimal strategy (at least for this example) is to divide the supergraph-optimal weight
$1/N$ by the formation probability $p$. Finally, for fixed $p$ and $N$, the fastest rate is achieved when $\beta$ is as
negative as possible.
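
As a quick sanity check of the equicorrelated model used in this example, the sketch below builds $R_q = p(1-p)[(1-\beta) I + \beta 1 1^T]$ and compares its numerically computed eigenvalues with the two closed-form values. The helper names and the sample values of $N$, $p$, $\beta$ are hypothetical, and the second lower bound on $\beta$ is the one that follows from $\mathrm{Prob}(q_i = 1, q_j = 1) \ge 0$ (it requires $p < 1$).

```python
import numpy as np

def equicorrelated_covariance(M, p, beta):
    """R_q = p(1-p) [(1-beta) I + beta 1 1^T] for M links."""
    return p * (1.0 - p) * ((1.0 - beta) * np.eye(M) + beta * np.ones((M, M)))

def beta_lower_bound(M, p):
    """Larger of the two lower bounds on beta (assumes 0 < p < 1)."""
    return max(-1.0 / (M - 1), -p / (1.0 - p))

N, p, beta = 5, 0.6, 0.2          # illustrative values only
M = N * (N - 1) // 2              # number of undirected links
eigs = np.sort(np.linalg.eigvalsh(equicorrelated_covariance(M, p, beta)))

lam_1 = p * (1 - p) * (1 + (M - 1) * beta)   # simple eigenvalue
lam_rest = p * (1 - p) * (1 - beta)          # eigenvalue of multiplicity M-1
```

For $\beta > 0$ the simple eigenvalue is the largest one; at $\beta = -1/(M-1)$ it drops to zero, which is where the covariance matrix stops being positive definite.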

E. Numerical optimization: subgradient algorithm 

We solve the optimization problem in (26) for generic networks by the subgradient algorithm, [30]. 
In this subsection, we consider spatially uncorrelated links, and we comment on extensions for spatially 
correlated links. Expressions for spatially correlated links are provided in Appendix B. 

We recall that the function $\phi(W)$ is convex (proved in Section IV-C). It is nonsmooth because $\lambda_{\max}(\cdot)$
is nonsmooth. Let $H$ be a subgradient of the function $\phi(W)$. To derive the expression for the
subgradient of $\phi(W)$, we use the variational interpretation of $\phi(W)$:

$$\phi(W) = \max_{v^T v = 1} v^T\left(\mathrm{E}[\mathcal{W}^2] - J\right) v = \max_{v^T v = 1} f_v(W) \qquad (43)$$

By the subgradient calculus, a subgradient of $\phi(W)$ at point $W$ is equal to a subgradient of the
function $f_v(W)$ for which the maximum of the optimization problem (43) is attained, see, e.g., [30]. The
maximum of $f_v(W)$ (with respect to $v$) is attained at $v = u$, where $u$ is the eigenvector of the matrix
$\mathrm{E}[\mathcal{W}^2] - J$ that corresponds to its maximal eigenvalue, i.e., the maximal eigenvector. In our case, the
function $f_u(W)$ is differentiable (quadratic function), and hence the subgradient of $f_u(W)$ (and also the
subgradient of $\phi(W)$) is equal to the gradient of $f_u(W)$, [30]:



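$$H_{ij} = \begin{cases} u^T \dfrac{\partial\left(\mathrm{E}[\mathcal{W}^2] - J\right)}{\partial W_{ij}}\, u, & \text{if } (i,j) \in E \\ 0, & \text{otherwise.} \end{cases} \qquad (44)$$

We compute, for $(i,j) \in E$:

$$u^T \frac{\partial\left(\overline{W}^2 - J + R_C\right)}{\partial W_{ij}}\, u = u^T\left[-2\,\overline{W} P_{ij}(e_i - e_j)(e_i - e_j)^T + 4 W_{ij} P_{ij}(1 - P_{ij})(e_i - e_j)(e_i - e_j)^T\right] u \qquad (45)$$
$$= 2 P_{ij}\,(u_i - u_j)\, u^T\left(\overline{W}_j - \overline{W}_i\right) + 4 P_{ij}(1 - P_{ij})\, W_{ij}\,(u_i - u_j)^2 \qquad (46)$$

where $\overline{W}_i$ denotes the $i$-th column of $\overline{W}$.

The subgradient algorithm is given by Algorithm 1. The stepsize $\alpha_k$ is nonnegative, diminishing, and
nonsummable: $\lim_{k \to \infty} \alpha_k = 0$, $\sum_{k=1}^{\infty} \alpha_k = \infty$. We choose the stepsize $\alpha_k$, $k = 1, 2, \ldots$, similarly as in [29].

Algorithm 1: Subgradient algorithm

Set initial $W^{(1)} \in S_W$
Set $k = 1$
Repeat
    Compute a subgradient $H^{(k)}$ of $\phi$ at $W^{(k)}$, and set $W^{(k+1)} = W^{(k)} - \alpha_k H^{(k)}$
    $k := k + 1$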
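
The subgradient iteration for spatially uncorrelated symmetric links can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes $\overline{W} = I - \sum_{i<j} W_{ij} P_{ij} (e_i - e_j)(e_i - e_j)^T$ and $R_C = 2\sum_{i<j} W_{ij}^2 P_{ij}(1 - P_{ij})(e_i - e_j)(e_i - e_j)^T$ (the forms implied by the derivative in (45)), it uses a hypothetical stepsize $\alpha_k = 0.1/\sqrt{k}$ that is diminishing and nonsummable, and the function names are made up for the example. `W` and `P` are symmetric $N \times N$ arrays with zero diagonal, with `P[i, j] = 0` for pairs that are not links.

```python
import numpy as np

def phi_and_subgradient(W, P):
    """phi(W) = lambda_max(E[W^2] - J) and a subgradient H, eqn. (46) style,
    for independent symmetric links with formation probabilities P."""
    N = W.shape[0]
    J = np.ones((N, N)) / N
    WP = W * P
    W_bar = np.eye(N) + WP - np.diag(WP.sum(axis=1))   # expected weight matrix
    V = W**2 * P * (1.0 - P)
    R_C = 2.0 * (np.diag(V.sum(axis=1)) - V)           # covariance term
    eigvals, eigvecs = np.linalg.eigh(W_bar @ W_bar - J + R_C)
    u = eigvecs[:, -1]                                 # maximal eigenvector
    du = u[:, None] - u[None, :]                       # du[i, j] = u_i - u_j
    g = W_bar @ u                                      # g[i] = u^T (i-th column of W_bar)
    dg = g[None, :] - g[:, None]                       # dg[i, j] = g_j - g_i
    H = 2.0 * P * du * dg + 4.0 * P * (1.0 - P) * W * du**2
    return eigvals[-1], H

def subgradient_method(W0, P, iters=200, step=lambda k: 0.1 / np.sqrt(k)):
    """Run the subgradient iteration and return the best iterate found."""
    W = W0.copy()
    best_phi, best_W = np.inf, W0.copy()
    for k in range(1, iters + 1):
        phi, H = phi_and_subgradient(W, P)
        if phi < best_phi:                 # subgradient steps are not monotone,
            best_phi, best_W = phi, W.copy()   # so keep the best point seen
        W = W - step(k) * H
    return best_W, best_phi
```

Because a subgradient step may increase the objective, the sketch tracks the best iterate instead of returning the last one. The entry `H[i, j]` is the derivative with respect to the single weight shared by the undirected link $(i, j)$, so the update keeps `W` symmetric and leaves non-links at zero.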

V. Weight optimization: asymmetric random links 

We now address the weight optimization for asymmetric random networks. Subsections V-A and V-B 
introduce the optimization criterion and the corresponding weight optimization problem, respectively. 
Subsection V-C shows that this optimization problem is convex. 

A. Optimization criterion: Mean square deviation convergence rate 
Introduce now

$$\psi(W) := \lambda_{\max}\left(\mathrm{E}\left[\mathcal{W}^T (I - J)\, \mathcal{W}\right]\right). \qquad (47)$$

Reference [12] shows that the mean square deviation MSdev satisfies the following inequality:

$$\mathrm{MSdev}(k+1) \le \psi(W)\, \mathrm{MSdev}(k). \qquad (48)$$

Thus, if the quantity $\psi(W)$ is strictly less than one, then MSdev converges to zero asymptotically, with
the worst case rate equal to $\psi(W)$. We remark that the condition (10) is not needed for eqn. (48) to
hold, i.e., MSdev converges to zero even if condition (10) is not satisfied; this condition is needed only
for eqn. (25) to hold, i.e., only for the MSE to converge to zero.

B. Asymmetric network: Weight optimization problem formulation

In the case of asymmetric links, we propose to optimize the mean square deviation convergence rate,
i.e., to solve the following optimization problem:

$$\begin{array}{ll} \text{minimize} & \psi(W) \\ \text{subject to} & W \in S_W^{\mathrm{asym}} \\ & \sum_{j=1}^{N} P_{ij} W_{ij} = \sum_{j=1}^{N} P_{ji} W_{ji}, \quad i = 1, \ldots, N \end{array} \qquad (49)$$

The constraints in the optimization problem (49) assure that, in expectation, condition (10) is satisfied,
i.e., that

$$1^T\, \mathrm{E}[\mathcal{W}] = 1^T. \qquad (50)$$

If (50) is satisfied, then the consensus algorithm converges to the true average $x_{\mathrm{avg}}$ in expectation [12].

Equation (50) is a linear constraint with respect to the weights $W_{ij}$, and thus does not violate the
convexity of the optimization problem (49). We emphasize that in the case of asymmetric links, we do
not assume the weights $W_{ij}$ and $W_{ji}$ to be equal. In section VI-B, we show that allowing $W_{ij}$ and $W_{ji}$
to be different leads to better solutions in the case of asymmetric networks.

C. Convexity of the weight optimization problem

We show that the function $\psi(W)$ is convex. We remark that reference [12] shows that the function is
convex when all the weights $W_{ij}$ are equal to $g$. We show here that this function is convex even when
the weights are different.

Lemma 4 (Convexity of $\psi(W)$) The function $\psi : S_W^{\mathrm{asym}} \to \mathbb{R}_+$ is convex.

Proof: The proof is very similar to the proof of Lemma 3. The proof starts with introducing $W$ as
in eqn. (29) and with introducing $\mathcal{W}(t)$ as in eqn. (30). The difference is that, instead of considering the
matrix $\mathcal{W}^2 - J$, we consider now the matrix $\mathcal{W}^T (I - J)\, \mathcal{W}$. In the proof of Lemma 3, we introduced
the auxiliary function $\eta(t)$ given by (31); here, we introduce the auxiliary function $\kappa(t)$, given by:

$$\kappa(t) = \lambda_{\max}\left(\mathrm{E}\left[\mathcal{W}(t)^T (I - J)\, \mathcal{W}(t)\right]\right) \qquad (51)$$

and show that $\psi(W)$ is convex by proving that $\kappa(t)$ is convex. Then, we proceed as in the proof of
Lemma 3. In eqn. (35) the matrix $\mathcal{Z}_2$ becomes $\mathcal{Z}_2 := \mathcal{Y}^T (I - J)\, \mathcal{Y}$. The random matrix $\mathcal{Z}_2$ is obviously
positive semidefinite. The proof then proceeds as in Lemma 3.

VI. Simulations

We demonstrate the effectiveness of our approach with a comprehensive set of simulations. These
simulations cover examples of both asymmetric and symmetric networks, and both networks with random
link failures and with randomized protocols. In particular, we consider the following two standard sets
of experiments with random networks: 1) spatially correlated link failures and symmetric links and 2)
randomized protocols, in particular, the broadcast gossip algorithm [12]. With respect to the first set, we
consider correlated link failures with two types of correlation structure. We are particularly interested in
studying the dependence of the performance and of the gains on the size of the network $N$ and on the
link correlation structure.

In all these experiments, we consider geometric random graphs. Nodes communicate among themselves
if within their radius of communication, $r$. The nodes are uniformly distributed on a unit square. The
number of nodes is $N = 100$ and the average degree is $15\% N$. In subsection VI-A, the random
instantiations of the networks are undirected; in subsection VI-B, the random instantiations of the networks
are directed.

In the first set of experiments with correlated link failures, the link formation probabilities $P_{ij}$ are
chosen such that they decay quadratically with the distance:

$$P_{ij} = 1 - k \left(\frac{\delta_{ij}}{r}\right)^2 \qquad (52)$$

where we choose $k = 0.7$. We see that, with (52), a link will be active with high probability if the nodes
are close ($\delta_{ij} \simeq 0$), while the link will be down with probability at most 0.7, if the nodes are apart by $r$.

We recall that we refer to our weight design, i.e., to the solutions of the weight optimization
problems (26), (49), as probability based weights (PBW). We study the performance of PBW, comparing it
with the standard weight choices available in the literature: in subsection VI-A, we compare it with the
Metropolis weights (MW), discussed in [29], and the supergraph based weights (SGBW). The SGBW are
the optimal (nonnegative) weights designed for a static (nonrandom) graph $G$, which are then applied to a
random network when the underlying supergraph is $G$. This is the strategy used in [23]. For asymmetric
links (and for asymmetric weights $W_{ij} \ne W_{ji}$), in subsection VI-B, we compare PBW with the optimal
weight choice in [12] for broadcast gossip that considers all the weights to be equal.
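
The experimental setup described above (nodes uniform on the unit square, links between nodes within radius $r$, formation probabilities decaying quadratically with distance) can be sketched as follows. The sketch assumes the quadratic decay has the form $P_{ij} = 1 - k(\delta_{ij}/r)^2$ with $k = 0.7$, which matches the stated behavior at $\delta_{ij} \simeq 0$ and $\delta_{ij} = r$; the function names and the radius value are hypothetical.

```python
import numpy as np

def geometric_supergraph(N, r, rng):
    """Place N nodes uniformly on the unit square; link nodes closer than r."""
    pos = rng.random((N, 2))
    diff = pos[:, None, :] - pos[None, :, :]
    dist = np.sqrt((diff**2).sum(axis=2))
    adj = (dist <= r) & ~np.eye(N, dtype=bool)   # no self-loops
    return dist, adj

def formation_probabilities(dist, adj, r, k=0.7):
    """Assumed rule: P_ij = 1 - k (dist_ij / r)^2 on links, 0 elsewhere."""
    return np.where(adj, 1.0 - k * (dist / r) ** 2, 0.0)

rng = np.random.default_rng(0)
r = 0.25                                     # illustrative radius only
dist, adj = geometric_supergraph(100, r, rng)
P = formation_probabilities(dist, adj, r)
avg_relative_degree = adj.sum(axis=1).mean() / 100
```

In practice one would tune `r` until `avg_relative_degree` is close to the 15% used in the experiments; the value above is only a starting point.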
In the first set of experiments in subsection VI-A, we quantify the performance gain of PBW over
SGBW and MW by gains defined as ratios of time constants. The first gain is

$$\Gamma_s^{\tau} = \frac{\tau_{\mathrm{SGBW}}}{\tau_{\mathrm{PBW}}} \qquad (53)$$

where $\tau$ is a time constant defined as:

$$\tau = -\frac{1}{0.5 \ln \phi(W)} \qquad (54)$$

We also compare PBW with SGBW and MW with the following measures:

$$\Gamma_s^{\eta} = \frac{\eta_{\mathrm{SGBW}}}{\eta_{\mathrm{PBW}}} \qquad (55)$$

$$\Gamma_m^{\eta} = \frac{\eta_{\mathrm{MW}}}{\eta_{\mathrm{PBW}}} \qquad (56)$$

where $\eta$ is the asymptotic time constant defined by

$$\eta = -\frac{1}{\ln \gamma}, \qquad \gamma = \lim_{k \to \infty} \left(\frac{\|e(k)\|}{\|e(0)\|}\right)^{1/k} \qquad (57)$$

Reference [23] shows that, for random networks, $\eta$ is an almost sure constant and $\tau$ is an upper bound
on $\eta$.

Subsections VI-A and VI-B provide further details on the experiments.
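
A short worked example of these quantities, using the average values reported in Tables I and III for correlation structure (64). It assumes the time constant has the form $\tau = -1/(0.5 \ln \phi)$, i.e., the number of iterations over which the root-mean-squared error bound $\phi^{k/2}$ decays by a factor $e$; the function names are hypothetical.

```python
import math

def time_constant(phi):
    """tau = -1 / (0.5 ln phi) for an MSE convergence rate 0 < phi < 1."""
    return -1.0 / (0.5 * math.log(phi))

def gain(baseline, pbw):
    """Ratio of a baseline time constant to the PBW time constant."""
    return baseline / pbw

tau_sgbw = time_constant(0.91)      # average phi of SGBW, Table I
tau_pbw = time_constant(0.87)       # average phi of PBW, Table I

gamma_s_eta = gain(20.0, 13.0)      # average eta: SGBW vs PBW, Table I
gamma_m_eta = gain(29.0, 13.0)      # average eta: MW vs PBW, Table I
```

The time constants computed from the average rates (about 21.2 and 14.4) are close to, but not equal to, the tabulated averages 22.7 and 15.4, because the tables average $\tau$ over the 20 supergraphs instead of evaluating it at the average $\phi$. The ratios of average $\eta$ values reproduce the gains 1.54 and approximately 2.22 listed in Table III.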

A. Symmetric links: random networks with correlated link failures 

To completely define the probability distribution of the random link vector $q \in \mathbb{R}^M$, we must assign a
probability to each of the $2^M$ possible realizations of $q$, $q = (\alpha_1, \ldots, \alpha_M)^T$, $\alpha_i \in \{0, 1\}$. Since in networks
of practical interest $M$ may be very large, of order 1000 or larger, specifying the complete distribution of
the vector $q$ is most likely infeasible. Hence, we work with the second moment description and specify
only the first two moments of its distribution, the mean and the covariance, $\pi$ and $R_q$. Without loss of
generality, order the links so that $\pi_1 \le \pi_2 \le \ldots \le \pi_M$.

Lemma 5 The mean and the variance $(\pi, R_q)$ of a Bernoulli random vector satisfy:

$$0 \le \pi_i \le 1, \quad i = 1, \ldots, M \qquad (59)$$
$$R_q \succeq 0 \qquad (60)$$
$$\max\left(-\pi_i \pi_j,\ \pi_i + \pi_j - 1 - \pi_i \pi_j\right) \le [R_q]_{ij} \le \pi_i (1 - \pi_j) =: \overline{R}_{ij}, \quad i < j \qquad (61)$$

Proof: Equations (59) and (60) must hold because the $\pi_i$'s are probabilities and $R_q$ is a covariance
matrix. Recall that

$$[R_q]_{ij} = \mathrm{E}[q_i q_j] - \mathrm{E}[q_i]\, \mathrm{E}[q_j] = \mathrm{Prob}(q_i = 1, q_j = 1) - \pi_i \pi_j. \qquad (62)$$

To prove the lower bound in (61), observe that:

$$\mathrm{Prob}(q_i = 1, q_j = 1) = \mathrm{Prob}(q_i = 1) + \mathrm{Prob}(q_j = 1) - \mathrm{Prob}(\{q_i = 1\} \text{ or } \{q_j = 1\})$$
$$= \pi_i + \pi_j - \mathrm{Prob}(\{q_i = 1\} \text{ or } \{q_j = 1\}) \ge \pi_i + \pi_j - 1. \qquad (63)$$

In view of the fact that $\mathrm{Prob}(q_i = 1, q_j = 1) \ge 0$, eqn. (63), and eqn. (62), the proof for the lower
bound in (61) follows. The upper bound in (61) holds because $\mathrm{Prob}(q_i = 1, q_j = 1) \le \pi_i$, $i < j$, and
eqn. (62). ■
If we choose a pair $(\pi, R_q)$ that satisfies (59), (60), (61), one cannot guarantee that $(\pi, R_q)$ is a valid pair,
in the sense that there exists a probability distribution on $q$ with its first and second moments being equal
to $(\pi, R_q)$, [31]. Furthermore, if $(\pi, R_q)$ is given, simulating binary random variables with marginal
probabilities and correlations equal to $(\pi, R_q)$ is challenging. These questions have been studied, see [32],
[31]. We use the results in [32], [31] to generate our correlation models. In particular, we use the result
that $\overline{R} = [\overline{R}_{ij}]$ (see eqn. (61)) is a valid correlation structure for any $\pi$, [32]. We simulate the correlated
links by the method proposed in [31]; this method handles a wide range of different correlation structures
and has a small computational cost.

Link correlation structures. We consider two different correlation structures for any pair of links $i$ and
$j$ in the supergraph:

$$[R_q]_{ij} = c_1 \overline{R}_{ij} \qquad (64)$$

$$[R_q]_{ij} = c_2\, \theta^{\kappa_{ij}}\, \overline{R}_{ij} \qquad (65)$$

where $c_1 \in (0, 1]$, $\theta \in (0, 1)$ and $c_2 \in (0, 1]$ are parameters, and $\kappa_{ij}$ is the distance between links $i$ and
$j$ defined as the length of the shortest path that connects them in the supergraph.

The correlation structure (64) assumes that the correlation between any pair of links is a fraction of
the maximal possible correlation, for the given $\pi$ (see eqn. (61) to recall $\overline{R}_{ij}$). Reference [32] constructs
a method for generating the correlation structure (64).

The correlation structure (65) assumes that the correlation between the links decays geometrically with
the distance $\kappa_{ij}$. In our simulations, we set $\theta = 0.95$, and find the maximal $c_2$ such that the resulting
correlation structure can be simulated by the method in [31]. For all the networks that we simulated in
the paper, $c_2$ is between 0.09 and 0.11.
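
To make the bounds of Lemma 5 and the structure (64) concrete, the sketch below builds the upper-bound matrix $\overline{R}_{ij} = \min(\pi_i, \pi_j) - \pi_i \pi_j$ (which equals $\pi_i(1 - \pi_j)$ for $i < j$ when the links are ordered by increasing $\pi$) and scales its off-diagonal entries by $c_1$. Keeping the diagonal at the Bernoulli variances $\pi_i(1 - \pi_i)$ is an assumption of this sketch, as are the function names and the sample values; it does not implement the sampling methods of [31], [32].

```python
import numpy as np

def max_covariance(pi):
    """Upper-bound matrix: Rbar_ij = min(pi_i, pi_j) - pi_i * pi_j."""
    pi = np.asarray(pi, dtype=float)
    return np.minimum.outer(pi, pi) - np.outer(pi, pi)

def structure_64(pi, c1):
    """Covariance with off-diagonal entries c1 * Rbar_ij (assumed reading of (64));
    the diagonal keeps the Bernoulli variances pi_i (1 - pi_i)."""
    R_bar = max_covariance(pi)
    R = c1 * R_bar
    np.fill_diagonal(R, np.diag(R_bar))
    return R

def lemma5_lower_bound(pi):
    """Lower bound of (61): max(-pi_i pi_j, pi_i + pi_j - 1 - pi_i pi_j)."""
    pi = np.asarray(pi, dtype=float)
    prod = np.outer(pi, pi)
    return np.maximum(-prod, np.add.outer(pi, pi) - 1.0 - prod)

pi = np.array([0.3, 0.5, 0.6, 0.8, 0.9])   # illustrative, sorted increasingly
R_q = structure_64(pi, c1=0.5)
```

Since $c_1 \overline{R}$ with the original diagonal equals $c_1 \overline{R} + (1 - c_1)\,\mathrm{diag}(\overline{R})$, the result is a sum of two positive semidefinite matrices and therefore satisfies (60) for any $c_1 \in (0, 1]$.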

Results. We want to address the following two questions: 1) What is the performance gain ($\Gamma_s^{\eta}$, $\Gamma_m^{\eta}$ in
eqns. (55), (56)) of PBW over SGBW and MW; and 2) How does this gain scale with the network size,
i.e., the number of nodes $N$?

Performance gain of PBW over SGBW and MW. We consider question 1) for both correlation
structures (64), (65). We generate 20 instantiations of our standard supergraphs (with 100 nodes each and
approximately the same average relative degree, equal to 15%). Then, for each supergraph, we generate
formation probabilities according to rule (52). For each supergraph with the given formation probabilities,
we generate two link correlation structures, (64) and (65). We evaluate the convergence rate $\phi_j$ given
by (25), the time constants $\eta_j$ given by (57) and $\tau_j$ given by (54), and the performance gains $[\Gamma_s^{\eta}]_j$, $[\Gamma_m^{\eta}]_j$ for
each supergraph ($j = 1, \ldots, 20$). We compute the mean $\overline{\phi}$, the maximum $\phi^{+}$ and the minimum $\phi^{-}$ from
the list $\{\phi_j\}$, $j = 1, \ldots, 20$ (and similarly for $\{\eta_j\}$ and $\{\tau_j\}$, $j = 1, \ldots, 20$). Results for the correlation
structure (64) are given in Table I and for the correlation structure (65), in Table II. The performance
gains $\Gamma_s^{\eta}$, $\Gamma_m^{\eta}$ for both correlation structures are in Table III. In addition, Figure 1 depicts the averaged
error norm over 100 sample paths. We can see that the PBW outperform the SGBW and the MW for
both correlation structures (64) and (65). For example, for the correlation (64), the PBW take less than
40 iterations to achieve 0.2% precision, while the SGBW take more than 70, and the MW take more than
80 iterations. For correlation (65), to achieve 0.2% precision, the PBW take about 47 iterations, while
the SGBW and the MW take more than 90 and 100 iterations, respectively. The average performance
gain of PBW over MW is larger than the performance gain over SGBW, for both (64) and (65). The
gain over SGBW, $\Gamma_s^{\eta}$, is significant, being 1.54 for (64) and 1.73 for (65). The gain with the correlation
structure (65) is larger than the gain with (64), suggesting that larger gain over SGBW is achieved with
smaller correlations. This is intuitive, since large positive correlations imply that the random links tend
to occur simultaneously, i.e., in a certain sense random network realizations are more similar to the
underlying supergraph.

Fig. 1. Average error norm versus iteration number. Left: correlation structure (64); right: correlation structure (65).

TABLE I
Correlation structure (64): average $\overline{(\cdot)}$, maximal $(\cdot)^{+}$, and minimal $(\cdot)^{-}$ values of the MSE
convergence rate $\phi$ (23), and corresponding time constants $\tau$ (54) and $\eta$ (57), for 20 generated
supergraphs

                      SGBW     PBW      MW
$\overline{\phi}$     0.91     0.87     -
$\phi^{+}$            0.95     0.92     -
$\phi^{-}$            0.89     0.83     -
$\overline{\tau}$     22.7     15.4     -
$\tau^{+}$            28       19       -
$\tau^{-}$            20       14       -
$\overline{\eta}$     20       13       29
$\eta^{+}$            25       16       38
$\eta^{-}$            19       12       27

TABLE II
Correlation structure (65): average $\overline{(\cdot)}$, maximal $(\cdot)^{+}$, and minimal $(\cdot)^{-}$ values of the MSE
convergence rate $\phi$ (23), and corresponding time constants $\tau$ (54) and $\eta$ (57), for 20 generated
supergraphs

                      SGBW     PBW      MW
$\overline{\phi}$     0.92     0.86     -
$\phi^{+}$            0.94     0.90     -
$\phi^{-}$            0.91     0.84     -
$\overline{\tau}$     25.5     14.3     -
$\tau^{+}$            34       19       -
$\tau^{-}$            21       12       -
$\overline{\eta}$     20       11.5     24.4
$\eta^{+}$            23       14       29
$\eta^{-}$            16       9        19

TABLE III
Average $\overline{(\cdot)}$, maximal $(\cdot)^{+}$, and minimal $(\cdot)^{-}$ performance gains $\Gamma_s^{\eta}$ and $\Gamma_m^{\eta}$ (55) for the two
correlation structures (64) and (65) for 20 generated supergraphs

                               Correlation (64)   Correlation (65)
$\overline{\Gamma}_s^{\eta}$   1.54               1.73
$(\Gamma_s^{\eta})^{+}$        1.66               1.91
$(\Gamma_s^{\eta})^{-}$        1.46               1.58
$\overline{\Gamma}_m^{\eta}$   2.22               2.11
$(\Gamma_m^{\eta})^{+}$        2.42               2.45
$(\Gamma_m^{\eta})^{-}$        2.07               1.92

Notice that the networks with $R_q$ as in (65) achieve a faster rate than those with (64) (having at the same
time similar supergraphs and formation probabilities). This is in accordance with the analytical studies
in section IV-D that suggest that faster rates can be achieved for smaller (or negative) correlations if $G$
and $\pi$ are fixed.

Performance gain of PBW over SGBW as a function of the network size. To answer question 2), we
generate the supergraphs with $N$ ranging from 30 up to 160, keeping the average relative degree of the
supergraph approximately the same (15%). Again, PBW performs better than MW ($\tau_{\mathrm{SGBW}} < 0.85\, \tau_{\mathrm{MW}}$),
so we focus on the dependence of $\Gamma_s^{\eta}$ on $N$, since it is more critical.

Figure 2 plots $\Gamma_s^{\eta}$ versus $N$, for the two correlation structures. The gain $\Gamma_s^{\eta}$ increases with $N$ for
both (65) and (64).



Fig. 2. Performance gain of PBW over SGBW ($\Gamma_s^{\eta}$, eqn. (55)) as a function of the number of nodes in the network, for the correlation structures in eqn. (64) and eqn. (65).



B. Broadcast gossip algorithm [12]: Asymmetric random links

In the previous section, we demonstrated the effectiveness of our approach in networks with random
symmetric link failures. This section demonstrates the validity of our approach in randomized
protocols with asymmetric links. We study the broadcast gossip algorithm [12]. Although the optimization
problem (49) is convex for generic spatially correlated directed random links, we pursue here numerical
optimization of the broadcast gossip algorithm proposed in [12], where, at each time step, node $i$ is
selected at random, with probability $1/N$. Node $i$ then broadcasts its state to all its neighbors within its
wireless range. The neighbors then update their state by performing the weighted average of the received
state with their own state. The nodes outside the set $\Omega_i$ and the node $i$ itself keep their previous state
unchanged. The broadcast gossip algorithm is well suited for WSN applications, since it exploits the
broadcast nature of wireless media and avoids bidirectional communication [12].
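
The broadcast step described above can be turned into a small sketch that enumerates the $N$ equally likely weight-matrix realizations and evaluates the MSdev rate $\psi(W) = \lambda_{\max}(\mathrm{E}[\mathcal{W}^T (I - J)\mathcal{W}])$ from (47). The convention that `W[l, i]` is the weight receiver $l$ puts on the state heard from broadcaster $i$ is an assumption of this sketch, as are the function names.

```python
import numpy as np

def broadcast_realizations(W, adj):
    """Yield the N weight-matrix realizations, one per broadcasting node i.
    adj[l, i] is True when node l hears node i's broadcast."""
    N = W.shape[0]
    for i in range(N):
        Wk = np.eye(N)                      # silent nodes keep their state
        for l in np.flatnonzero(adj[:, i]):
            Wk[l, l] = 1.0 - W[l, i]        # weight on the receiver's own state
            Wk[l, i] = W[l, i]              # weight on the received state
        yield Wk

def msdev_rate(W, adj):
    """psi(W) = lambda_max(E[Wk^T (I - J) Wk]), each node chosen w.p. 1/N."""
    N = W.shape[0]
    C = np.eye(N) - np.ones((N, N)) / N     # I - J
    S = sum(Wk.T @ C @ Wk for Wk in broadcast_realizations(W, adj)) / N
    return np.linalg.eigvalsh(S)[-1], S

# Two-node example with equal weights w = 0.3 on both directed links.
adj2 = np.array([[False, True], [True, False]])
psi2, S2 = msdev_rate(0.3 * adj2, adj2)
```

For two nodes with a common weight $w$, each realization satisfies $(I - J)\mathcal{W} = (1 - w)(I - J)$, so the rate is $(1 - w)^2$; the example above therefore evaluates to $0.49$. Passing a matrix with different entries `W[l, i]` and `W[i, l]` gives the asymmetric-weight case that PBW optimizes over.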

Reference [12] shows that, in broadcast gossiping, all the nodes converge a.s. to a common random
value $c$ with mean $x_{\mathrm{avg}}$ and bounded mean squared error. Reference [12] studies the case when the weights
$W_{ij} = g$, $\forall (i,j) \in E$, and finds the optimal $g = g^*$ that optimizes the mean square deviation MSdev (see
eqn. (49)). We optimize the same objective function as in [12], but allow different
weights for different directed links. We detail the numerical optimization for the broadcast gossip
in Appendix C. We consider again the supergraph $G$ from our standard experiment with $N = 100$
and average degree $15\% N$. For the broadcast gossip, we compare the performance of PBW with 1) the
optimal equal weights in [12] with $W_{ij} = g^*$, $(i,j) \in E$; 2) broadcast gossip with $W_{ij} = 0.5$, $(i,j) \in E$.

Figure 3 (left) plots the consensus mean square deviation MSdev for the 3 different weight choices.
The decay of MSdev is much faster for the PBW than for $W_{ij} = 0.5$, $\forall (i,j)$, and $W_{ij} = g^*$, $\forall (i,j)$. For
example, the MSdev falls below 10% after 260 iterations for PBW (i.e., 260 broadcast transmissions);
broadcast gossip with $W_{ij} = g^*$ and $W_{ij} = 0.5$ take 420 transmissions to achieve the same precision.
This is to be expected, since PBW has many more degrees of freedom to optimize than the broadcast
gossip in [12] with all equal weights $W_{ij} = g^*$. Figure 3 (right) plots the MSE, i.e., the deviation from
the true average $x_{\mathrm{avg}}$, for the three weight choices. PBW shows faster decay of MSE than the broadcast
gossip with $W_{ij} = g^*$ and $W_{ij} = 0.5$. The weights provided by PBW are different among themselves,
varying from 0.3 to 0.95. The weights $W_{ij}$ and $W_{ji}$ are also different, where the maximal difference
between $W_{ij}$ and $W_{ji}$, $(i,j) \in E$, is 0.6. Thus, in the case of directed random networks, an asymmetric
matrix $W$ results in a faster convergence rate.

Fig. 3. Broadcast gossip algorithm with different weight choices. Left: total variance; right: total mean squared error.

VII. Conclusion 

In this paper, we studied the optimization of the weights for the consensus algorithm under random
topology and spatially correlated links. We considered both networks with random link failures and
randomized algorithms; from the weight optimization point of view, both fit into the same framework. We
showed that, for symmetric random links, optimizing the MSE convergence rate is a convex optimization
problem, and, for asymmetric links, optimizing the mean squared deviation from the current average state
is also a convex optimization problem. We illustrated with simulations that the probability based weights
(PBW) outperform previously proposed weight strategies that do not use the statistics of the network
randomness. The simulations also show that using the link quality estimates and the link correlations
for designing the weights significantly improves the convergence speed, typically reducing the time to
consensus by one third to a half, compared to choices previously proposed in the literature.

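Appendix I
Proof of Lemma 1 (a sketch)

Eqn. (18) follows from the expectation of (3). To prove the remainder of the Lemma, we find $\mathcal{W}^2$,
$\overline{W}^2$, and the expectation $\mathrm{E}[\mathcal{W}^2]$. We obtain successively:

$$\mathcal{W}^2 = \left(W \odot \mathcal{A} + I - \mathrm{diag}(W \mathcal{A})\right)^2$$
$$= (W \odot \mathcal{A})^2 + \mathrm{diag}^2(W \mathcal{A}) + I + 2\, W \odot \mathcal{A} - 2\, \mathrm{diag}(W \mathcal{A}) - (W \odot \mathcal{A})\, \mathrm{diag}(W \mathcal{A}) - \mathrm{diag}(W \mathcal{A})\, (W \odot \mathcal{A})$$

$$\overline{W}^2 = (W \odot P)^2 + \mathrm{diag}^2(W P) + I + 2\, W \odot P - 2\, \mathrm{diag}(W P) - \left[(W \odot P)\, \mathrm{diag}(W P) + \mathrm{diag}(W P)\, (W \odot P)\right]$$

$$\mathrm{E}[\mathcal{W}^2] = \mathrm{E}\left[(W \odot \mathcal{A})^2\right] + \mathrm{E}\left[\mathrm{diag}^2(W \mathcal{A})\right] + I + 2\, W \odot P - 2\, \mathrm{diag}(W P) - \mathrm{E}\left[(W \odot \mathcal{A})\, \mathrm{diag}(W \mathcal{A}) + \mathrm{diag}(W \mathcal{A})\, (W \odot \mathcal{A})\right]$$

We will next show the following three equalities:

$$\mathrm{E}\left[(W \odot \mathcal{A})^2\right] = (W \odot P)^2 + W_C^T \left\{ R_A \odot (1 1^T \otimes I) \right\} W_C \qquad (66)$$
$$\mathrm{E}\left[\mathrm{diag}^2(W \mathcal{A})\right] = \mathrm{diag}^2(W P) + W_C^T \left\{ R_A \odot (I \otimes 1 1^T) \right\} W_C \qquad (67)$$
$$\mathrm{E}\left[(W \odot \mathcal{A})\, \mathrm{diag}(W \mathcal{A}) + \mathrm{diag}(W \mathcal{A})\, (W \odot \mathcal{A})\right] = (W \odot P)\, \mathrm{diag}(W P) + \mathrm{diag}(W P)\, (W \odot P) + W_C^T \left\{ R_A \odot B \right\} W_C \qquad (68)$$

First, consider (66) and find $\mathrm{E}\left[(W \odot \mathcal{A})^2\right]$. Algebraic manipulations allow us to write $(W \odot \mathcal{A})^2$ as
follows:

$$(W \odot \mathcal{A})^2 = W_C^T \left\{ \mathcal{A}_2 \odot (1 1^T \otimes I) \right\} W_C, \quad \mathcal{A}_2 = \mathrm{Vec}(\mathcal{A})\, \mathrm{Vec}^T(\mathcal{A}) \qquad (69)$$

To compute the expectation of (69), we need $\mathrm{E}[\mathcal{A}_2]$, which can be written as

$$\mathrm{E}[\mathcal{A}_2] = P_2 + R_A, \quad \text{with } P_2 = \mathrm{Vec}(P)\, \mathrm{Vec}^T(P).$$

Equation (66) follows, realizing that

$$W_C^T \left\{ P_2 \odot (1 1^T \otimes I) \right\} W_C = (W \odot P)^2.$$

Now consider (67) and (68). After algebraic manipulations, it can be shown that

$$\mathrm{diag}^2(W \mathcal{A}) = W_C^T \left\{ \mathcal{A}_2 \odot (I \otimes 1 1^T) \right\} W_C$$
$$(W \odot \mathcal{A})\, \mathrm{diag}(W \mathcal{A}) + \mathrm{diag}(W \mathcal{A})\, (W \odot \mathcal{A}) = W_C^T \left\{ \mathcal{A}_2 \odot B \right\} W_C$$

Computing the expectations in the last two equations leads to eqn. (67) and eqn. (68).

Using equalities (66), (67), and (68) and comparing the expressions for $\overline{W}^2$ and $\mathrm{E}[\mathcal{W}^2]$ leads to:

$$R_C = \mathrm{E}[\mathcal{W}^2] - \overline{W}^2 = W_C^T \left\{ R_A \odot \left( I \otimes 1 1^T + 1 1^T \otimes I - B \right) \right\} W_C \qquad (70)$$

This completes the proof of Lemma 1.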



Appendix II 

SUBGRADIENT STEP CALCULATION FOR THE CASE OF SPATIALLY CORRELATED LINKS 
To compute the subgradient $H$, from eqns. (44) and (45) we consider the computation of
$\mathrm{E}[\mathcal{W}^2] - J = \overline{W}^2 - J + R_C$. Matrix $\overline{W}^2 - J$ is computed in the same way as for the uncorrelated case.
To compute $R_C$, from (70), partition the matrix $R_A$ into $N \times N$ blocks:

$$R_A = \begin{pmatrix} R_{11} & R_{12} & \cdots & R_{1N} \\ R_{21} & R_{22} & \cdots & R_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ R_{N1} & R_{N2} & \cdots & R_{NN} \end{pmatrix}$$

Denote by $d_{ij}$, by $c_{ij}^{l}$, and by $r_{ij}^{l}$ the diagonal, the $l$-th column, and the $l$-th row of the block $R_{ij}$. It
can be shown that the matrix $R_C$ can be computed as follows:



[Rc]u = Wl{duQWi) + WfRuWi 
Denote by $R_A(:, k)$ the $k$-th column of the matrix $R_A$ and by

ki = (ej In) Ra{:, {i - 1)N + j), k2 = (ef In) Ra{:, {j - l)iV + i),

fca = {eJ<^lN)RA{:,{i-l)N + j), k4={eJ<S)lN)RA{:,U-l)N + i).
Quantities $k_1$, $k_2$, $k_3$ and $k_4$ depend on $(i, j)$, but for the sake of notational simplicity the indexes are
omitted. It can be shown that the computation of $H_{ij}$, $(i,j) \in E$, boils down to:



Hi. 



2 uf Wf 4 + 2 Wf 4- + 2ui W'f {u Qki) + 2 uj W'f {u ^2) - 2 m uj W'f d 



T 

3 ^ji 



2 Ui Uj Wl 4j -2uiWl{uQ ks) - 2 uj Wj {u /C4) + 2 Pij {m - Uj) {Wj -Wi^ 

Appendix III 

Numerical optimization for the broadcast gossip algorithm 

With broadcast gossip, the matrix $\mathcal{W}(k)$ can take $N$ different realizations, corresponding to the
broadcast cycles of each of the $N$ sensors. We denote these realizations by $\mathcal{W}^{(i)}$, where $i$ indexes the
broadcasting node. We can write the random realization of the broadcast gossip matrix $\mathcal{W}^{(i)}$, $i = 1, \ldots, N$,
as follows:

$$\mathcal{W}^{(i)}(k) = W \odot \mathcal{A}^{(i)}(k) + I - \mathrm{diag}\left(W \mathcal{A}^{(i)}(k)\right), \qquad (71)$$

where $\mathcal{A}^{(i)}_{li}(k) = 1$, if $l \in \Omega_i$. Other entries of $\mathcal{A}^{(i)}(k)$ are zero.

Similarly as in Appendix A, we can arrive at the expressions for $\mathrm{E}[\mathcal{W}^T \mathcal{W}] := \mathrm{E}[\mathcal{W}^T(k)\, \mathcal{W}(k)]$ and for
$\mathrm{E}[\mathcal{W}^T J \mathcal{W}] := \mathrm{E}[\mathcal{W}^T(k)\, J\, \mathcal{W}(k)]$, for all $k$. We remark that the matrix $W$ need not be symmetric
for the broadcast gossip and that $W_{ij} = 0$, if $(i,j) \notin E$.



1=1,1^^1 



E (W^W).^. 
E [W^JW).. 



1=1,1^^1 
1 



\ Ij^i / l=l,l^i 



+ 



iV2 

1 
iV2 

1 

iV2 



N 



l=l,l^i 



N 



Denote by W^^ := E [W^W] - E [W'^JW] and recall the definition of the MSdev rate ipiW) (47). 
We have that il^{W) = Xmax{W^'^). We proceed with the calculation of the subgradient of ip{W) 
similarly as in subsection IV-E. The partial derivative of the cost function $\psi(W)$ with respect to weight
$W_{ij}$ is given by:

d 



where q is eigenvector associated with the maximal eigenvalue of the matrix W^^*^. Finally, partial 
derivatives of the entries of the matrix with respect to weight Wij are given by the following set 
of equations: 



dWij''' N ^ '''' 

d 2 2^ 



d 



N 



— — Wi^G ^ Q otherwise. 
dW] "-'^ 



l,m 